Text File | 1992-08-06 | 50.0 KB | 1,192 lines |
- Info file: cmu-user.info, -*-Text-*-
- produced by latexinfo-format-buffer
- from file: cmu-user.tex
-
-
- File: cmu-user.info Node: Inline Expansion Recording, Prev: Inline Expansion, Up: Inline Expansion, Next: Semi-Inline Expansion
-
- Inline Expansion Recording
- --------------------------
-
-
- Inline expansion requires that the source for the inline expanded
- function be available when calls to the function are compiled. The
- compiler doesn't remember the inline expansion for every function,
- since that would take an excessive amount of space. Instead, the
- programmer must tell the compiler to record the inline expansion
- before the definition of the inline expanded function is compiled.
- This is done by globally declaring the function inline before the
- function is defined, using the `inline' or `extensions:maybe-inline'
- declaration.
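- 
- For example, the recording can be requested with a global declaration
- before the definition (a minimal sketch; `myfun' is illustrative):
- 
```lisp
;;; Globally declare MYFUN inline before its definition, so the
;;; compiler records the inline expansion when the DEFUN is
;;; compiled.
(declaim (inline myfun))

(defun myfun (x)
  (+ x 1))

;;; Calls compiled after this point can be inline expanded.
(defun caller (y)
  (myfun y))
```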
-
- In addition to recording the inline expansion of inline functions at the
- time the function is compiled, `compile-file' also puts the inline
- expansion in the output file. When the output file is loaded, the
- inline expansion is made available for subsequent compilations; there is
- no need to compile the definition again to record the inline expansion.
-
- If a function is declared inline, but no expansion is recorded, then the
- compiler will give an efficiency note like:
-
- Note: MYFUN is declared inline, but has no expansion.
-
- When you get this note, check that the `inline' declaration and the
- definition appear before the calls that are to be inline expanded. This
- note will also be given if the inline expansion for a `defun' could not
- be recorded because the `defun' was in a non-null lexical environment.
-
- File: cmu-user.info Node: Semi-Inline Expansion, Prev: Inline Expansion Recording, Up: Inline Expansion, Next: The Maybe-Inline Declaration
-
- Semi-Inline Expansion
- ---------------------
-
-
- Python supports SEMI-INLINE functions. Semi-inline expansion shares a
- single copy of a function across all the calls in a component by
- converting the inline expansion into a local function (*Note Local
- Call::.) This takes up less space when there are multiple calls, but
- also provides less opportunity for context dependent optimization. When
- there is only one call, the result is identical to normal inline
- expansion. Semi-inline expansion is done when the `space' optimization
- quality is `0', and the function has been declared
- `extensions:maybe-inline'.
-
- This mechanism of inline expansion combined with local call also allows
- recursive functions to be inline expanded. If a recursive function is
- declared `inline', calls will actually be compiled semi-inline.
- Although recursive functions are often so complex that there is little
- advantage to semi-inline expansion, it can still be useful in the same
- sort of cases where normal inline expansion is especially advantageous,
- i.e. functions where the calling context can help a lot.
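- 
- As a hedged sketch, a recursive function declared `inline' is
- compiled semi-inline at its call sites (the function names are
- illustrative):
- 
```lisp
(declaim (inline count-up))

;;; A recursive function declared INLINE; calls to it are
;;; actually compiled semi-inline, as a shared local function.
(defun count-up (n acc)
  (declare (fixnum n acc))
  (if (zerop n)
      acc
      (count-up (1- n) (1+ acc))))

(defun count-to (n)
  (count-up n 0))
```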
-
- File: cmu-user.info Node: The Maybe-Inline Declaration, Prev: Semi-Inline Expansion, Up: Inline Expansion
-
- The Maybe-Inline Declaration
- ----------------------------
-
-
- The `extensions:maybe-inline' declaration is a CMU Common Lisp
- extension. It is similar to `inline', but indicates that inline
- expansion may sometimes be desirable, rather than saying that inline
- expansion should almost always be done. When used in a global
- declaration, `extensions:maybe-inline' causes the expansion for the
- named functions to be recorded, but the functions aren't actually inline
- expanded unless `space' is `0' or the function is eventually (perhaps
- locally) declared `inline'.
-
- Use of the `extensions:maybe-inline' declaration followed by the `defun' is
- preferable to the standard idiom of:
-
- (proclaim '(inline myfun))
- (defun myfun () ...)
- (proclaim '(notinline myfun))
-
- ;;; Any calls to `myfun' here are not inline expanded.
-
- (defun somefun ()
- (declare (inline myfun))
- ;;
- ;; Calls to `myfun' here are inline expanded.
- ...)
-
- The problem with using `notinline' in this way is that in CMU Common
- Lisp it does more than just suppress inline expansion, it also forbids
- the compiler to use any knowledge of `myfun' until a later `inline'
- declaration overrides the `notinline'. This prevents compiler warnings
- about incorrect calls to the function, and also prevents block
- compilation.
-
- The `extensions:maybe-inline' declaration is used like this:
-
- (proclaim '(extensions:maybe-inline myfun))
- (defun myfun () ...)
-
- ;;; Any calls to `myfun' here are not inline expanded.
-
- (defun somefun ()
- (declare (inline myfun))
- ;;
- ;; Calls to `myfun' here are inline expanded.
- ...)
-
- (defun someotherfun ()
- (declare (optimize (space 0)))
- ;;
- ;; Calls to `myfun' here are expanded semi-inline.
- ...)
-
- In this example, the use of `extensions:maybe-inline' causes the
- expansion to be recorded when the `defun' for `somefun' is compiled, and
- doesn't waste space through doing inline expansion by default. Unlike
- `notinline', this declaration still allows the compiler to assume that
- the known definition really is the one that will be called when giving
- compiler warnings, and also allows the compiler to do semi-inline
- expansion when the policy is appropriate.
-
- When the goal is merely to control whether inline expansion is done by
- default, it is preferable to use `extensions:maybe-inline' rather than
- `notinline'. The `notinline' declaration should be reserved for those
- special occasions when a function may be redefined at run-time, so the
- compiler must be told that the obvious definition of a function is not
- necessarily the one that will be in effect at the time of the call.
-
-
- File: cmu-user.info Node: Object Representation, Prev: Inline Expansion, Up: Advanced Compiler Use and Efficiency Hints, Next: Numbers
-
- Object Representation
- =====================
-
-
- A somewhat subtle aspect of writing efficient CMU Common Lisp programs
- is choosing the correct data structures so that the underlying objects
- can be implemented efficiently. This is partly because of the need for
- multiple representations for a given value, but is also due to the
- sheer number of object types that CMU Common Lisp has built in. The
- number of possible representations complicates the choice of a good
- representation because semantically similar objects may vary in their
- efficiency depending on how the program operates on them.
-
- * Menu:
-
- * Think Before You Use a List::
- * Structures::
- * Arrays::
- * Vectors::
- * Bit-Vectors::
- * Hashtables::
-
-
- File: cmu-user.info Node: Think Before You Use a List, Prev: Object Representation, Up: Object Representation, Next: Structures
-
- Think Before You Use a List
- ---------------------------
-
-
- Although Lisp's creator seemed to think that it was for LISt Processing,
- the astute observer may have noticed that the chapter on list
- manipulation makes up less than three percent of Common Lisp: the
- Language II. The language has grown since Lisp 1.5 -- new data types
- supersede lists for many purposes.
-
- File: cmu-user.info Node: Structures, Prev: Think Before You Use a List, Up: Object Representation, Next: Arrays
-
- Structures
- ----------
-
- One of the best ways of building complex data structures is to define
- appropriate structure types using `defstruct'. In Python, access of
- structure slots is always at least as fast as list or vector access, and
- is usually faster. In comparison to a list representation of a tuple,
- structures also have a space advantage.
-
- Even if structures weren't more efficient than other representations, structure
- use would still be attractive because programs that use structures in
- appropriate ways are much more maintainable and robust than programs written
- using only lists. For example:
-
- (rplaca (caddr (cadddr x)) (caddr y))
-
- could have been written using structures in this way:
-
- (setf (beverage-flavor (astronaut-beverage x)) (beverage-flavor y))
-
- The second version is more maintainable because it is easier to
- understand what it is doing. It is more robust because structure
- accesses are type checked. An `astronaut' will never be confused with a
- `beverage', and the result of `beverage-flavor' is always a flavor. See
- sections *Note Structure Types:: and *Note The Freeze-Type Declaration::
- for more information about structure types. See *Note Type Inference::
- for a number of examples that make clear the advantages of structure typing.
-
- Note that the structure definition should be compiled before any uses of
- its accessors or type predicate so that these function calls can be
- efficiently open-coded.
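- 
- The structure types behind the second version might be defined as
- follows (a hypothetical sketch; the slot layout is not taken from the
- manual):
- 
```lisp
(defstruct beverage
  flavor)

(defstruct astronaut
  name
  beverage)   ; holds a BEVERAGE structure

;; Compiled after these definitions, accesses such as
;; (beverage-flavor (astronaut-beverage x)) are type checked
;; and open-coded.
```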
-
- File: cmu-user.info Node: Arrays, Prev: Structures, Up: Object Representation, Next: Vectors
-
- Arrays
- ------
-
-
- Arrays are often the most efficient representation for collections of objects
- because:
-
- * Array representations are often the most compact. An array is
- always more compact than a list containing the same number of
- elements.
-
- * Arrays allow fast constant-time access.
-
- * Arrays are easily destructively modified, which can reduce consing.
-
- * Array element types can be specialized, which reduces both overall
- size and consing.
-
-
-
- Access of arrays that are not of type `simple-array' is less efficient,
- so declarations are appropriate when an array is of a simple type like
- `simple-string' or `simple-bit-vector'. Arrays are almost always
- simple, but the compiler may not be able to prove simpleness at every
- use. The only way to get a non-simple array is to use the
- `:displaced-to', `:fill-pointer' or `:adjustable' arguments to `make-array'.
- If you don't use these hairy options, then arrays can always be declared
- to be simple.
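- 
- For example, declaring an argument `simple-string' rather than
- `string' lets the access be open-coded (a minimal sketch):
- 
```lisp
(defun first-char (s)
  ;; SIMPLE-STRING (not STRING) tells the compiler the array is
  ;; simple, so CHAR can be open-coded.
  (declare (simple-string s))
  (char s 0))
```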
-
- Because of the many specialized array types and the possibility of
- non-simple arrays, array access is much like generic arithmetic. In
- order for array accesses to be efficiently compiled, the element type
- and simpleness of the array must be known at compile time. If there is
- inadequate information, the compiler is forced to call a generic array
- access routine. You can detect inefficient array accesses by enabling
- efficiency notes.
-
- File: cmu-user.info Node: Vectors, Prev: Arrays, Up: Object Representation, Next: Bit-Vectors
-
- Vectors
- -------
-
-
- Vectors (one dimensional arrays) are particularly useful, since in
- addition to their obvious array-like applications, they are also well
- suited to representing sequences. In comparison to a list
- representation, vectors are faster to access and take up between two and
- sixty-four times less space (depending on the element type.) As with
- arbitrary arrays, the compiler needs to know that vectors are not
- complex, so you should use `simple-string' in preference to `string',
- etc.
-
- The only advantage that lists have over vectors for representing
- sequences is that it is easy to change the length of a list, add to it
- and remove items from it. Likely signs of archaic, slow Lisp code are
- `nth' and `nthcdr'. If you are using these functions you should
- probably be using a vector.
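- 
- For example, an indexed sequence is better held in a simple vector
- accessed with `svref' than in a list walked with `nth' (a sketch):
- 
```lisp
;; List version: NTH walks the list, so access is O(n).
(defun list-elt (l i)
  (nth i l))

;; Vector version: SVREF is constant-time and open-coded.
(defun vec-elt (v i)
  (declare (simple-vector v) (fixnum i))
  (svref v i))
```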
-
- File: cmu-user.info Node: Bit-Vectors, Prev: Vectors, Up: Object Representation, Next: Hashtables
-
- Bit-Vectors
- -----------
-
-
- Another thing that lists have been used for is set manipulation. In
- applications where there is a known, reasonably small universe of items,
- bit-vectors can be used to improve performance. This is much less
- convenient than using lists, because instead of symbols, each element in
- the universe must be assigned a numeric index into the bit vector.
- Using a bit-vector will nearly always be faster, and can be tremendously
- faster if the number of elements in the set is not small. The logical
- operations on `simple-bit-vector's are efficient, since they operate on
- a word at a time.
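- 
- As a sketch, set union and intersection over a small indexed universe
- reduce to word-at-a-time logical operations:
- 
```lisp
;; Each set is a SIMPLE-BIT-VECTOR over a universe of four
;; items, with a 1 at the index of each member.
(defun set-union (a b)
  (declare (simple-bit-vector a b))
  (bit-ior a b))

(defun set-intersection (a b)
  (declare (simple-bit-vector a b))
  (bit-and a b))
```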
-
-
- File: cmu-user.info Node: Hashtables, Prev: Bit-Vectors, Up: Object Representation
-
- Hashtables
- ----------
-
-
- Hashtables are an efficient and general mechanism for maintaining
- associations such as the association between an object and its name.
- Although hashtables are usually the best way to maintain associations,
- efficiency and style considerations sometimes favor the use of an
- association list (a-list).
-
- `assoc' is fairly fast when the TEST argument is `eq' or `eql' and there
- are only a few elements, but the time goes up in proportion with the
- number of elements. In contrast, the hash-table lookup has a somewhat
- higher overhead, but the speed is largely unaffected by the number of
- entries in the table. For an `equal' hash-table or alist, hash-tables
- have an even greater advantage, since the test is more expensive.
- Whatever you do, be sure to use the most restrictive test function
- possible.
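- 
- For example, when the keys are symbols an `eq' table suffices and
- avoids the more expensive `equal' test (a minimal sketch):
- 
```lisp
;; Keys are symbols, so EQ hashing is the most restrictive
;; (and cheapest) test that works.
(defvar *name-table* (make-hash-table :test 'eq))

(setf (gethash 'fred *name-table*) "Fred Flintstone")
```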
-
- The style argument observes that although hash-tables and alists
- overlap in function, they do not do all things equally well.
-
- * Alists are good for maintaining scoped environments. They were
- originally invented to implement scoping in the Lisp interpreter,
- and are still used for this in Python. With an alist one can
- non-destructively change an association simply by consing a new
- element on the front. This is something that cannot be done with
- hash-tables.
-
- * Hashtables are good for maintaining a global association. The
- value associated with an entry can easily be changed with `setf'.
- With an alist, one has to go through contortions, either
- `rplacd''ing the cons if the entry exists, or pushing a new one if
- it doesn't. The side-effecting nature of hash-table operations is
- an advantage here.
-
-
-
- Historically, symbol property lists were often used for global name
- associations. Property lists provide an awkward and error-prone
- combination of name association and record structure. If you must use
- the property list, please store all the related values in a single
- structure under a single property, rather than using many properties.
- This makes access more efficient, and also adds a modicum of typing and
- abstraction. *Note More About Types in Python:: for information on
- types in CMU Common Lisp.
-
-
- File: cmu-user.info Node: Numbers, Prev: Object Representation, Up: Advanced Compiler Use and Efficiency Hints, Next: General Efficiency Hints
-
- Numbers
- =======
-
-
- Numbers are interesting because numbers are one of the few Common Lisp
- data types that have direct support in conventional hardware. If a
- number can be represented in the way that the hardware expects it, then
- there is a big efficiency advantage.
-
- Using hardware representations is problematical in Common Lisp due to dynamic typing
- (where the type of a value may be unknown at compile time.) It is possible to
- compile code for statically typed portions of a Common Lisp program with efficiency
- comparable to that obtained in statically typed languages such as C, but not
- all Common Lisp implementations succeed. There are two main barriers to efficient
- numerical code in Common Lisp:
-
- * The compiler must prove that the numerical expression is in fact statically
- typed, and
-
- * The compiler must be able to somehow reconcile the conflicting
- demands of the hardware mandated number representation with the
- Common Lisp requirements of dynamic typing and garbage-collecting
- dynamic storage allocation.
-
-
- Because of its type inference (*Note Type Inference::) and efficiency
- notes, Python is better than conventional Common Lisp compilers at
- ensuring that numerical expressions are statically typed. Python also
- goes somewhat farther than existing compilers in the area of allowing
- native machine number representations in the presence of garbage
- collection.
-
- * Menu:
-
- * Descriptors::
- * Non-Descriptor Representations::
- * Variables::
- * Generic Arithmetic::
- * Fixnums::
- * Word Integers::
- * Floating Point Efficiency::
- * Specialized Arrays::
- * Interactions With Local Call::
- * Representation of Characters::
-
-
- File: cmu-user.info Node: Descriptors, Prev: Numbers, Up: Numbers, Next: Non-Descriptor Representations
-
- Descriptors
- -----------
-
-
- Common Lisp's dynamic typing requires that it be possible to represent any value
- with a fixed length object, known as a DESCRIPTOR. This fixed-length
- requirement is implicit in features such as:
-
- * Data types (like `simple-vector') that can contain any type of object, and
- that can be destructively modified to contain different objects (of possibly
- different types.)
-
- * Functions that can be called with any type of argument, and that
- can be redefined at run time.
-
-
- In order to save space, a descriptor is invariably represented as a
- single word. Objects that can be directly represented in the descriptor
- itself are said to be IMMEDIATE. Descriptors for objects larger than
- one word are in reality pointers to the memory actually containing the
- object.
-
- Representing objects using pointers has two major disadvantages:
-
- * The memory pointed to must be allocated on the heap, so it must
- eventually be freed by the garbage collector. Excessive heap
- allocation of objects (or "consing") is inefficient in several
- ways.
-
- * Representing an object in memory requires the compiler to emit
- additional instructions to read the actual value in from memory,
- and then to write the value back after operating on it.
-
-
- The introduction of garbage collection makes things even worse, since the
- garbage collector must be able to determine whether a descriptor is an
- immediate object or a pointer. This requires that a few bits in each
- descriptor be dedicated to the garbage collector. The loss of a few bits
- doesn't seem like much, but it has a major efficiency implication -- objects
- whose natural machine representation is a full word (integers and
- single-floats) cannot have an immediate representation. So the compiler is
- forced to use an unnatural immediate representation (such as `fixnum') or a
- natural pointer representation (with the attendant consing overhead.)
-
-
- File: cmu-user.info Node: Non-Descriptor Representations, Prev: Descriptors, Up: Numbers, Next: Variables
-
- Non-Descriptor Representations
- ------------------------------
-
-
- From the discussion above, we can see that the standard descriptor
- representation has many problems, the worst being number consing.
- Common Lisp compilers try to avoid these descriptor efficiency problems by using
- NON-DESCRIPTOR representations. A compiler that uses non-descriptor
- representations can compile this function so that it does no number consing:
-
- (defun multby (vec n)
- (declare (type (simple-array single-float (*)) vec)
- (single-float n))
- (dotimes (i (length vec))
- (setf (aref vec i)
- (* n (aref vec i)))))
-
- If a descriptor representation were used, each iteration of the loop
- might cons two floats and do three times as many memory references.
-
- As its negative definition suggests, the range of possible
- non-descriptor representations is large. The performance improvement
- from non-descriptor representation depends upon both the number of types
- that have non-descriptor representations and the number of contexts in
- which the compiler is forced to use a descriptor representation.
-
- Many Common Lisp compilers support non-descriptor representations for
- float types such as `single-float' and `double-float'. Python adds
- support for full word integers, characters and system-area pointers
- (unconstrained pointers). Many Common Lisp compilers support
- non-descriptor representations for variables and array elements.
- Python adds support for non-descriptor arguments and return values in
- local call.
- 
- File: cmu-user.info Node: Variables, Prev: Non-Descriptor Representations, Up: Numbers, Next: Generic Arithmetic
-
- Variables
- ---------
-
-
- In order to use a non-descriptor representation for a variable or expression
- intermediate value, the compiler must be able to prove that the value is always
- of a particular type having a non-descriptor representation. Type inference
- (*Note Type Inference::) often needs some help from user-supplied
- declarations. The best kind of type declaration is a variable type declaration
- placed at the binding point:
-
- (let ((x (car l)))
- (declare (single-float x))
- ...)
-
- Use of `the', or of variable declarations not at the binding form is
- insufficient to allow non-descriptor representation of the variable -- with
- these declarations it is not certain that all values of the variable are of the
- right type. It is sometimes useful to introduce a gratuitous binding that
- allows the compiler to change to a non-descriptor representation, like:
-
- (etypecase x
- ((signed-byte 32)
- (let ((x x))
- (declare (type (signed-byte 32) x))
- ...))
- ...)
-
- The declaration on the inner `x' is necessary here due to a phase
- ordering problem. Although the compiler will eventually prove that the
- outer `x' is a `(signed-byte 32)' within that `etypecase' branch, the
- inner `x' would have been optimized away by that time. Declaring the
- type makes let optimization more cautious.
-
- Note that storing a value into a global (or `special') variable always
- forces a descriptor representation. Wherever possible, you should
- operate only on local variables, binding any referenced globals to local
- variables at the beginning of the function, and doing any global
- assignments at the end.
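- 
- A sketch of this idiom, binding a special variable to a declared
- local before a loop (the names are illustrative):
- 
```lisp
(defvar *scale* 2.0f0)

(defun scale-all (vec)
  (declare (type (simple-array single-float (*)) vec))
  ;; Bind the special to a local with a declared type so the
  ;; compiler can use a non-descriptor representation inside
  ;; the loop.
  (let ((scale *scale*))
    (declare (single-float scale))
    (dotimes (i (length vec) vec)
      (setf (aref vec i) (* scale (aref vec i))))))
```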
-
- Efficiency notes signal uses of inefficient representations, so
- programmers needn't continually worry about the details of
- representation selection.
-
- File: cmu-user.info Node: Generic Arithmetic, Prev: Variables, Up: Numbers, Next: Fixnums
-
- Generic Arithmetic
- ------------------
-
-
- In CMU Common Lisp, arithmetic operations are GENERIC. (9) (*Note
- Generic Arithmetic-Footnotes::) The `+' function can be passed
- `fixnum's, `bignum's, `ratio's, and various kinds of `float's and
- `complex'es, in any combination. In addition to the inherent complexity
- of `bignum' and `ratio' operations, there is also a lot of overhead in
- just figuring out which operation to do and what contagion and
- canonicalization rules apply. The complexity of generic arithmetic is
- so great that it is inconceivable to open code it. Instead, the
- compiler does a function call to a generic arithmetic routine, consuming
- many instructions before the actual computation even starts.
-
- This is ridiculous, since even Common Lisp programs do a lot of arithmetic, and the
- hardware is capable of doing operations on small integers and floats with a
- single instruction. To get acceptable efficiency, the compiler special-cases
- uses of generic arithmetic that are directly implemented in the hardware. In
- order to open code arithmetic, several constraints must be met:
-
- * All the arguments must be known to be a good type of number.
-
- * The result must be known to be a good type of number.
-
- * Any intermediate values such as the result of `(+ a b)' in the call
- `(+ a b c)' must be known to be a good type of number.
-
- * All the above numbers with good types must be of the SAME good
- type. Don't try to mix integers and floats or different float
- formats.
-
-
- The "good types" are `(signed-byte 32)', `(unsigned-byte 32)',
- `single-float' and `double-float'. The following sections discuss
- these good numeric types in more detail.
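- 
- A sketch of these constraints: with the arguments, intermediates and
- result all the same good type, the whole expression can be open-coded:
- 
```lisp
(defun add3 (a b c)
  ;; All arguments are SINGLE-FLOATs, so the intermediate
  ;; (+ a b) and the final result are SINGLE-FLOATs as well,
  ;; and the additions can open-code into hardware instructions.
  (declare (single-float a b c))
  (+ a b c))
```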
-
- `float' is not a good type, since it might mean either `single-float' or
- `double-float'. `integer' is not a good type, since it might mean
- `bignum'. `rational' is not a good type, since it might mean `ratio'.
- Note however that these types are still useful in declarations, since
- type inference may be able to strengthen a weak declaration into a good
- one, when it would be at a loss if there was no declaration at all
- (*Note Type Inference::). The `integer' and `unsigned-byte' (or
- non-negative integer) types are especially useful in this regard, since
- they can often be strengthened to a good integer type.
-
- Arithmetic with `complex' numbers is inefficient in comparison to float and
- integer arithmetic. Complex numbers are always represented with a pointer
- descriptor (causing consing overhead), and complex arithmetic is always
- closed-coded using the general generic arithmetic functions. But arithmetic with
- complex types such as:
-
- (complex float)
- (complex fixnum)
-
- is still faster than `bignum' or `ratio' arithmetic, since the
- implementation is much simpler.
-
- Note: don't use `/' to divide integers unless you want the overhead of
- rational arithmetic. Use `truncate' even when you know that the
- arguments divide evenly.
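- 
- For example (a sketch; the division is known to be exact):
- 
```lisp
;; (/ n 2) would go through rational generic arithmetic;
;; TRUNCATE stays in integer arithmetic even when the
;; division is exact.
(defun halve (n)
  (declare (fixnum n))
  (values (truncate n 2)))
```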
-
- You don't need to remember all the rules for how to get open-coded
- arithmetic, since efficiency notes will tell you when and where there is
- a problem.
-
-
- File: cmu-user.info Node: Generic Arithmetic-Footnotes, Up: Generic Arithmetic
-
- (9) As Steele notes in CLTL II, this is a generic conception of generic,
- and is not to be confused with the CLOS concept of a generic function.
-
- File: cmu-user.info Node: Fixnums, Prev: Generic Arithmetic, Up: Numbers, Next: Word Integers
-
- Fixnums
- -------
-
-
- A fixnum is a "FIXed precision NUMber". In modern Common Lisp
- implementations, fixnums can be represented with an immediate
- descriptor, so operating on fixnums requires no consing or memory
- references. Clever choice of representations also allows some
- arithmetic operations to be done on fixnums using hardware supported
- word-integer instructions, somewhat reducing the speed penalty for using
- an unnatural integer representation.
-
- It is useful to distinguish the `fixnum' type from the fixnum
- representation of integers. In Python, there is absolutely nothing
- magical about the `fixnum' type in comparison to other finite integer
- types. `fixnum' is equivalent to (is defined with `deftype' to be)
- `(signed-byte 30)'. `fixnum' is simply the largest subset of integers
- that can be represented using an immediate fixnum descriptor.
-
- Unlike in other CMU Common Lisp compilers, it is in no way desirable to use the
- `fixnum' type in declarations in preference to more restrictive integer types
- such as `bit', `(integer -43 7)' and `(unsigned-byte 8)'. Since
- Python does understand these integer types, it is preferable to use the more
- restrictive type, as it allows better type inference (*Note Operation Specific Type Inference::.)
-
- The small, efficient fixnum is contrasted with bignum, or "BIG NUMber".
- This is another descriptor representation for integers, but this time a
- pointer representation that allows for arbitrarily large integers.
- Bignum operations are less efficient than fixnum operations, both
- because of the consing and memory reference overheads of a pointer
- descriptor, and also because of the inherent complexity of extended
- precision arithmetic. While fixnum operations can often be done with a
- single instruction, bignum operations are so complex that they are
- always done using generic arithmetic.
-
- A crucial point is that the compiler will use generic arithmetic if
- it can't PROVE that all the arguments, intermediate values, and results are
- fixnums. With bounded integer types such as `fixnum', the result type proves
- to be especially problematical, since these types are not closed under
- common arithmetic operations such as `+', `-', `*' and `/'. For
- example, `(1+ (the fixnum x))' does not necessarily evaluate to a
- `fixnum'. Bignums were added to Common Lisp to get around this problem, but they
- really just transform the correctness problem "if this add overflows, you will
- get the wrong answer" to the efficiency problem "if this add MIGHT overflow
- then your program will run slowly (because of generic arithmetic.)"
-
- There is just no getting around the fact that the hardware only directly
- supports short integers. To get the most efficient open coding, the
- compiler must be able to prove that the result is a good integer type.
- This is an argument in favor of using more restrictive integer types:
- `(1+ (the fixnum x))' may not always be a `fixnum', but `(1+ (the
- (unsigned-byte 8) x))' always is. Of course, you can also assert the
- result type by putting in lots of `the' declarations and then compiling
- with `safety' `0'.
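- 
- A sketch of the contrast (the types are chosen for illustration):
- 
```lisp
;; X + 1 might overflow FIXNUM into BIGNUM, so this can't be
;; fully open-coded without a result assertion:
(defun incr-fixnum (x)
  (declare (fixnum x))
  (1+ x))

;; With a more restrictive argument type the result provably
;; fits in a fixnum, allowing open coding:
(defun incr-byte (x)
  (declare (type (unsigned-byte 8) x))
  (1+ x))
```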
-
- File: cmu-user.info Node: Word Integers, Prev: Fixnums, Up: Numbers, Next: Floating Point Efficiency
-
- Word Integers
- -------------
-
-
- Python is unique in its efficient implementation of arithmetic
- on full-word integers through non-descriptor representations and open coding.
- Arithmetic on any subtype of these types:
-
- (signed-byte 32)
- (unsigned-byte 32)
-
- is reasonably efficient, although subtypes of `fixnum' remain somewhat
- more efficient.
-
- If a word integer must be represented as a descriptor, then the `bignum'
- representation is used, with its associated consing overhead. The
- support for word integers in no way changes the language semantics, it
- just makes arithmetic on small bignums vastly more efficient. It is
- fine to do arithmetic operations with mixed `fixnum' and word integer
- operands; just declare the most specific integer type you can, and let
- the compiler decide what representation to use.
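- 
- For example (a sketch; the argument types are illustrative):
- 
```lisp
(defun sum32 (x y)
  ;; Two (SIGNED-BYTE 31) arguments always sum to a
  ;; (SIGNED-BYTE 32), so the addition can use word-integer
  ;; arithmetic with no overflow check.
  (declare (type (signed-byte 31) x y))
  (the (signed-byte 32) (+ x y)))
```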
-
- In fact, to most users, the greatest advantage of word integer
- arithmetic is that it effectively provides a few guard bits on the
- fixnum representation. If there are missing assertions on intermediate
- values in a fixnum expression, the intermediate results can usually be
- proved to fit in a word. After the whole expression is evaluated, there
- will often be a fixnum assertion on the final result, allowing creation
- of a fixnum result without even checking for overflow.
-
- The remarks in section *Note Fixnums:: about fixnum result type also
- apply to word integers; you must be careful to give the compiler enough
- information to prove that the result is still a word integer. This
- time, though, when we blow out of word integers we land in generic
- bignum arithmetic, which is much worse than sleazing from `fixnum's to
- word integers. Note that mixing `(unsigned-byte 32)' arguments with
- arguments of any signed type (such as `fixnum') is a no-no, since the
- result might not be unsigned.
-
- File: cmu-user.info Node: Floating Point Efficiency, Prev: Word Integers, Up: Numbers, Next: Specialized Arrays
-
- Floating Point Efficiency
- -------------------------
-
-
- Arithmetic on objects of type `single-float' and `double-float' is
- efficiently implemented using non-descriptor representations and open
- coding. As for integer arithmetic, the arguments must be known to be of
- the same float type. Unlike for integer arithmetic, the results and
- intermediate values usually take care of themselves due to the rules of
- float contagion, i.e. `(1+ (the single-float x))' is always a
- `single-float'.
-
- Although they are not specially implemented, `short-float' and
- `long-float' are also acceptable in declarations, since they are
- synonyms for the `single-float' and `double-float' types, respectively.
- It is harmless to use list-style float type specifiers such as
- `(single-float 0.0 1.0)', but Python currently makes little use of
- bounds on float types.
-
- When a float must be represented as a descriptor, a pointer
- representation is used, creating consing overhead. For this reason, you
- should try to avoid situations (such as full call and non-specialized
- data structures) that force a descriptor representation.
-
- *Note Floats:: for information on the extensions to support IEEE
- floating point.
-
- File: cmu-user.info Node: Specialized Arrays, Prev: Floating Point Efficiency, Up: Numbers, Next: Interactions With Local Call
-
- Specialized Arrays
- ------------------
-
-
- CMU Common Lisp supports specialized array element types through the :element-type
- argument to `make-array'. When an array has a specialized element type, only
- elements of that type can be stored in the array. From this
- restriction come two major efficiency advantages:
-
- * A specialized array can save space by packing multiple elements
- into a single word. For example, a `base-char' array can have 4
- elements per word, and a `bit' array can have 32. This
- space-efficient representation is possible because it is not
- necessary to separately indicate the type of each element.
-
- * The elements in a specialized array can be given the same
- non-descriptor representation as the one used in registers and on
- the stack, eliminating the need for representation conversions when
- reading and writing array elements. For objects with pointer
- descriptor representations (such as floats and word integers) there
- is also a substantial consing reduction because it is not necessary
- to allocate a new object every time an array element is modified.
-
-
-
- These are the specialized element types currently supported:
-
- bit
- (unsigned-byte 2)
- (unsigned-byte 4)
- (unsigned-byte 8)
- (unsigned-byte 16)
- (unsigned-byte 32)
- base-character
- single-float
- double-float
-
- Although a `simple-vector' can hold any type of object, `t' should
- still be considered a specialized array type, since arrays with element
- type `t' are specialized to hold descriptors.
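-
- For example, a vector specialized to hold unboxed floats might be
- created like this (the variable name is illustrative):
-
-      (defvar *samples*
-        (make-array 1024 :element-type 'single-float
-                         :initial-element 0.0))
-
- Elements of such an array are stored in non-descriptor form, so
- writing an element does not cons a new float.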
-
- When using non-descriptor representations, it is particularly important
- to make sure that array accesses are open-coded, since in addition to
- the generic operation overhead, efficiency is lost when the array
- element is converted to a descriptor so that it can be passed to (or
- from) the generic access routine. You can detect inefficient array
- accesses by enabling efficiency notes (*Note Efficiency Notes::).
- *Note Arrays::.
-
- File: cmu-user.info Node: Interactions With Local Call, Prev: Specialized Arrays, Up: Numbers, Next: Representation of Characters
-
- Interactions With Local Call
- ----------------------------
-
-
- Local call has many advantages (*Note Local Call::); one relevant to
- our discussion here is that local call extends the usefulness of non-descriptor
- representations. If the compiler knows from the argument type that an argument
- has a non-descriptor representation, then the argument will be passed in that
- representation. The easiest way to ensure that the argument type is known at
- compile time is to always declare the argument type in the called function,
- like:
-
- (defun 2+f (x)
- (declare (single-float x))
- (+ x 2.0))
-
- The advantages of passing arguments and return values in a
- non-descriptor representation are the same as for non-descriptor
- representations in general: reduced consing and memory access (*Note
- Non-Descriptor Representations::.) This extends the applicative
- programming styles discussed in section *Note Local Call:: to numeric
- code. Also, if source files are kept reasonably small, block
- compilation can be used to reduce number consing to a minimum.
-
- Note that non-descriptor return values can only be used with the known return
- convention (section *Note Return Values::.) If the compiler can't prove that
- a function always returns the same number of values, then it must use the
- unknown values return convention, which requires a descriptor representation.
- Pay attention to the known return efficiency notes to avoid number consing.
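-
- One way to give the compiler this information is a global function
- type declamation (the function name is made up; treat this as a
- sketch, not a requirement):
-
-      (declaim (ftype (function (single-float single-float)
-                                (values single-float))
-                      scale))
-      (defun scale (x y)
-        (declare (single-float x y))
-        (* x y))
-
- Here both the argument types and the single return value are known,
- so local calls to `scale' can pass and return non-descriptor floats.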
-
- File: cmu-user.info Node: Representation of Characters, Prev: Interactions With Local Call, Up: Numbers
-
- Representation of Characters
- ----------------------------
-
-
- Python also uses a non-descriptor representation for characters when
- convenient. This improves the efficiency of string manipulation, but is
- otherwise pretty invisible; characters have an immediate descriptor
- representation, so there is not a great penalty for converting a
- character to a descriptor. Nonetheless, it may sometimes be helpful to
- declare character-valued variables as `base-character'.
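-
- For example (the function name is made up):
-
-      (defun first-char-upcased (s)
-        (declare (simple-string s))
-        (let ((c (schar s 0)))
-          ;; Declaring C as BASE-CHARACTER lets the compiler use a
-          ;; non-descriptor character representation.
-          (declare (base-character c))
-          (char-upcase c)))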
-
-
- File: cmu-user.info Node: General Efficiency Hints, Prev: Numbers, Up: Advanced Compiler Use and Efficiency Hints, Next: Efficiency Notes
-
- General Efficiency Hints
- ========================
-
-
- This section is a summary of various implementation costs and ways to
- get around them. These hints are relatively unrelated to the use of the
- Python compiler, and probably also apply to most other Common Lisp
- implementations. In each section, there are references to related
- in-depth discussion.
-
- * Menu:
-
- * Compile Your Code::
- * Avoid Unnecessary Consing::
- * Complex Argument Syntax::
- * Mapping and Iteration::
- * Trace Files and Disassembly::
-
-
- File: cmu-user.info Node: Compile Your Code, Prev: General Efficiency Hints, Up: General Efficiency Hints, Next: Avoid Unnecessary Consing
-
- Compile Your Code
- -----------------
-
-
- At this point, the advantages of compiling code relative to running it
- interpreted probably need not be emphasized too much, but remember that
- in CMU Common Lisp, compiled code typically runs hundreds of times
- faster than interpreted code. Also, compiled (`fasl') files load
- significantly faster than source files, so it is worthwhile compiling
- files which are loaded many times, even if the speed of the functions in
- the file is unimportant.
-
- Even disregarding the efficiency advantages, compiled code is as good or
- better than interpreted code. Compiled code can be debugged at the
- source level (see chapter *Note The Debugger::), and compiled code does
- more error checking. For these reasons, the interpreter should be
- regarded mainly as an interactive command interpreter, rather than as a
- programming language implementation.
-
- Do not be concerned about the performance of your program until you see
- its speed compiled. Some techniques that make compiled code run faster
- make interpreted code run slower.
-
- File: cmu-user.info Node: Avoid Unnecessary Consing, Prev: Compile Your Code, Up: General Efficiency Hints, Next: Complex Argument Syntax
-
- Avoid Unnecessary Consing
- -------------------------
-
-
- Consing is another name for allocation of storage, as done by
- the `cons' function (hence its name.) `cons' is by no means the only
- function which conses -- so do `make-array' and many other functions.
- Arithmetic and function call can also have hidden consing overheads. Consing
- hurts performance in the following ways:
-
- * Consing reduces memory access locality, increasing paging activity.
-
- * Consing takes time just like anything else.
-
- * Any space allocated eventually needs to be reclaimed, either by
- garbage collection or by starting a new `lisp' process.
-
-
-
- Consing is not undiluted evil, since programs do things other than
- consing, and appropriate consing can speed up the real work. It would
- certainly save time to allocate a vector of intermediate results that
- is reused hundreds of times. Also, if it is necessary to copy a large
- data structure many times, it may be more efficient to update the data
- structure non-destructively; this somewhat increases update overhead,
- but makes copying trivial.
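-
- A sketch of the reusable-vector idea (all names are illustrative):
-
-      (defvar *scratch*
-        (make-array 100 :element-type 'single-float
-                        :initial-element 0.0))
-
-      (defun halve-into-scratch (input)
-        (declare (type (simple-array single-float (*)) input))
-        ;; Results go into the preallocated *SCRATCH* vector, so no
-        ;; new storage is consed on each call.
-        (dotimes (i (min (length input) 100) *scratch*)
-          (setf (aref *scratch* i) (* 0.5 (aref input i)))))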
-
- Note that the remarks in section *Note Writing Efficient Code:: about
- the importance of separating tuning from coding also apply to consing
- overhead. The majority of consing will be done by a small portion of
- the program. The consing hot spots are even less predictable than the
- CPU hot spots, so don't waste time and create bugs by doing unnecessary
- consing optimization. During initial coding, avoid unnecessary
- side-effects and cons where it is convenient. If profiling reveals a
- consing problem, THEN go back and fix the hot spots.
-
- *Note Non-Descriptor Representations:: for a discussion of how to avoid
- number consing in Python.
-
-
- File: cmu-user.info Node: Complex Argument Syntax, Prev: Avoid Unnecessary Consing, Up: General Efficiency Hints, Next: Mapping and Iteration
-
- Complex Argument Syntax
- -----------------------
-
-
- Common Lisp has very powerful argument passing mechanisms. Unfortunately, two
- of the most powerful mechanisms, rest arguments and keyword arguments, have a
- significant performance penalty:
-
- * With keyword arguments, the called function has to parse the
- supplied keywords by iterating over them and checking them against
- the desired keywords.
-
- * With rest arguments, the function must cons a list to hold the
- arguments. If a function is called many times or with many
- arguments, large amounts of memory will be allocated.
-
-
- Although rest argument consing is worse than keyword parsing, neither
- problem is serious unless thousands of calls are made to such a
- function. The use of keyword arguments is strongly encouraged in
- functions with many arguments or with interfaces that are likely to be
- extended, and rest arguments are often natural in user interface
- functions.
-
- Optional arguments have some efficiency advantage over keyword
- arguments, but their syntactic clumsiness and lack of extensibility
- have caused many CMU Common Lisp programmers to abandon use of optionals
- except in functions that have obviously simple and immutable interfaces
- (such as `subseq'), or in functions that are only called in a few
- places. When defining an interface function to be used by other
- programmers or users, use of only required and keyword arguments is
- recommended.
-
- Parsing of `defmacro' keyword and rest arguments is done at compile
- time, so a macro can be used to provide a convenient syntax with an
- efficient implementation. If the macro-expanded form contains no
- keyword or rest arguments, then it is perfectly acceptable in inner
- loops.
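-
- For example, a macro can offer keyword syntax with no run-time
- parsing cost, since the keywords disappear in the expansion (the
- macro name is made up):
-
-      (defmacro clamp (x &key (min 0) (max 1))
-        `(max ,min (min ,max ,x)))
-
-      ;; (clamp y :min 1.0 :max 10.0) macroexpands into
-      ;; (MAX 1.0 (MIN 10.0 Y)) -- no keywords remain at run time.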
-
- Keyword argument parsing overhead can also be avoided by use of inline
- expansion (*Note Inline Expansion::) and block compilation (section
- *Note Block Compilation::.)
-
- Note: the compiler open-codes most heavily used system functions which
- have keyword or rest arguments, so that no run-time overhead is
- involved.
-
- File: cmu-user.info Node: Mapping and Iteration, Prev: Complex Argument Syntax, Up: General Efficiency Hints, Next: Trace Files and Disassembly
-
- Mapping and Iteration
- ---------------------
-
-
- One of the traditional Common Lisp programming styles is a highly applicative one,
- involving the use of mapping functions and many lists to store intermediate
- results. To compute the sum of the square-roots of a list of numbers, one
- might say:
-
- (apply #'+ (mapcar #'sqrt list-of-numbers))
-
-
- This programming style is clear and elegant, but unfortunately results
- in slow code. There are two reasons why:
-
- * The creation of lists of intermediate results causes much consing
- (see *Note Avoid Unnecessary Consing::).
-
- * Each level of application requires another scan down the list.
- Thus, disregarding other effects, the above code would probably
- take twice as long as a straightforward iterative version.
-
-
-
- An example of an iterative version of the same code:
-
- (do ((num list-of-numbers (cdr num))
- (sum 0 (+ (sqrt (car num)) sum)))
- ((null num) sum))
-
-
- See sections *Note Variable Type Inference:: and *Note Let
- Optimization:: for a discussion of the interactions of iteration
- constructs with type inference and variable optimization. Also, section
- *Note Local Tail Recursion:: discusses an applicative style of
- iteration.
-
- File: cmu-user.info Node: Trace Files and Disassembly, Prev: Mapping and Iteration, Up: General Efficiency Hints
-
- Trace Files and Disassembly
- ---------------------------
-
-
- In order to write efficient code, you need to know the relative costs of
- different operations. The main reason why writing efficient Common Lisp
- code is difficult is that there are so many operations, and the costs of
- these operations vary in obscure context-dependent ways. Although
- efficiency notes point out some problem areas, the only way to ensure
- generation of the best code is to look at the assembly code output.
-
- The `disassemble' function is a convenient way to get the assembly code
- for a function, but it can be very difficult to interpret, since the
- correspondence with the original source code is weak. A better (but
- more awkward) option is to use the :trace-file argument to
- `compile-file' to generate a trace file.
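-
- For example (the file names are illustrative):
-
-      (compile-file "frob.lisp" :trace-file "frob.trace")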
-
- A trace file is a dump of the compiler's internal representations, including
- annotated assembly code. Each component in the program gets three pages in
- the trace file (separated by "`^L'"):
-
- * The implicit-continuation (or IR1) representation of the optimized
- source. This is a dump of the flow graph representation used for
- "source level" optimizations. As you will quickly notice, it is
- not really very close to the source. This representation is not
- very useful to even sophisticated users.
-
- * The Virtual Machine (VM, or IR2) representation of the program.
- This dump represents the generated code as sequences of "Virtual
- OPerations" (VOPs.) This representation is intermediate between
- the source and the assembly code -- each VOP corresponds fairly
- directly to some primitive function or construct, but a given VOP
- also has a fairly predictable instruction sequence. An operation
- (such as `+') may have multiple implementations with different cost
- and applicability. The choice of a particular VOP such as
- `+/fixnum' or `+/single-float' represents this choice of
- implementation. Once you are familiar with it, the VM
- representation is probably the most useful for determining what
- implementation has been used.
-
- * An assembly listing, annotated with the VOP responsible for
- generating the instructions. This listing is useful for figuring
- out what a VOP does and how it is implemented in a particular
- context, but its large size makes it more difficult to read.
-
-
-
- Note that trace file generation takes much space and time, since the
- trace file is tens of times larger than the source file. To avoid huge
- confusing trace files and much wasted time, it is best to separate the
- critical program portion into its own file and then generate the trace
- file from this small file.
-
-
- File: cmu-user.info Node: Efficiency Notes, Prev: General Efficiency Hints, Up: Advanced Compiler Use and Efficiency Hints, Next: Profiling
-
- Efficiency Notes
- ================
-
-
- Efficiency notes are messages that warn the user that the compiler has
- chosen a relatively inefficient implementation for some operation.
- Usually an efficiency note reflects the compiler's desire for more type
- information. If the type of the values concerned is known to the
- programmer, then additional declarations can be used to get a more
- efficient implementation.
-
- Efficiency notes are controlled by the `extensions:inhibit-warnings'
- optimization quality (*Note The Optimize Declaration::.) When `speed'
- is greater than `extensions:inhibit-warnings', efficiency notes are
- enabled. Note that this implicitly enables efficiency notes whenever
- `speed' is increased from its default of `1'.
-
- Consider this program with an obscure missing declaration:
-
- (defun eff-note (x y z)
- (declare (fixnum x y z))
- (the fixnum (+ x y z)))
-
- If compiled with `(speed 3) (safety 0)', this note is given:
-
- In: DEFUN EFF-NOTE
- (+ X Y Z)
- ==>
- (+ (+ X Y) Z)
- Note: Forced to do inline (signed-byte 32) arithmetic (cost 3).
- Unable to do inline fixnum arithmetic (cost 2) because:
- The first argument is a (INTEGER -1073741824 1073741822),
- not a FIXNUM.
-
- This efficiency note tells us that the result of the intermediate computation
- `(+ x y)' is not known to be a `fixnum', so the addition of the
- intermediate sum to `z' must be done less efficiently. This can be fixed by
- changing the definition of `eff-note':
-
- (defun eff-note (x y z)
- (declare (fixnum x y z))
- (the fixnum (+ (the fixnum (+ x y)) z)))
-
-
- * Menu:
-
- * Type Uncertainty::
- * Efficiency Notes and Type Checking::
- * Representation Efficiency Notes::
- * Verbosity Control::
-
-
- File: cmu-user.info Node: Type Uncertainty, Prev: Efficiency Notes, Up: Efficiency Notes, Next: Efficiency Notes and Type Checking
-
- Type Uncertainty
- ----------------
-
-
- The main cause of inefficiency is the compiler's lack of adequate
- information about the types of function argument and result values.
- Many important operations (such as arithmetic) have an inefficient
- general (generic) case, but have efficient implementations that can
- usually be used if there is sufficient argument type information.
-
- Type efficiency notes are given when a value's type is uncertain. There is an
- important distinction between values that are not known to be of a good
- type (uncertain) and values that are known not to be of a good type.
- Efficiency notes are given mainly for the first case (uncertain types.) If it
- is clear to the compiler that there is not an efficient implementation for
- a particular function call, then an efficiency note will only be given if the
- `extensions:inhibit-warnings' optimization quality is `0' (*Note The Optimize Declaration::.)
-
- In other words, the default efficiency notes only suggest that you add
- declarations, not that you change the semantics of your program so that an
- efficient implementation will apply. For example, compilation of this form
- will not give an efficiency note:
-
- (elt (the list l) i)
-
- even though a vector access is more efficient than indexing a list.
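-
- If, on the other hand, the value really is known to be a vector,
- declaring it so does change the implementation chosen; for example:
-
-      (aref (the simple-vector v) i)
-
- is open-coded as a direct indexed access.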
-
- File: cmu-user.info Node: Efficiency Notes and Type Checking, Prev: Type Uncertainty, Up: Efficiency Notes, Next: Representation Efficiency Notes
-
- Efficiency Notes and Type Checking
- ----------------------------------
-
-
- It is important that the `eff-note' example above used
- `(safety 0)'. When type checking is enabled, you may get apparently
- spurious efficiency notes. With `(safety 1)', the note has this extra
- line on the end:
-
- The result is a (INTEGER -1610612736 1610612733), not a FIXNUM.
-
- This seems strange, since there is a `the' declaration on the result of
- that second addition.
-
- In fact, the inefficiency is real, and is a consequence of Python's
- treating declarations as assertions to be verified. The compiler can't
- assume that the result type declaration is true -- it must generate the
- result and then test whether it is of the appropriate type.
-
- In practice, this means that when you are tuning a program to run without type
- checks, you should work from the efficiency notes generated by unsafe
- compilation. If you want code to run efficiently with type checking, then you
- should pay attention to all the efficiency notes that you get during safe
- compilation. Since user supplied output type assertions (e.g., from `the')
- are disregarded when selecting operation implementations for safe code, you
- must somehow give the compiler information that allows it to prove that the
- result truly must be of a good type. In our example, it could be done by
- constraining the argument types more:
-
- (defun eff-note (x y z)
- (declare (type (unsigned-byte 18) x y z))
- (+ x y z))
-
- Of course, this declaration is acceptable only if the arguments to
- `eff-note' always ARE `(unsigned-byte 18)' integers.
-
-